Rank | Count | Beginning |
---|---|---|
2498 | 2390 | اس |
17881 | 1042 | دے |
213 | 770 | ایہ |
28578 | 733 | وچ |
7298 | 705 | انہاں |
12985 | 577 | تے |
8645 | 548 | اوہ |
16363 | 339 | دا |
16359 | 325 | دی |
28046 | 301 | نے |
6035 | 296 | اک |
35 | 272 | ، |
679 | 244 | آپ |
211 | 219 | ایہہ |
1261 | 204 | اُتے |
24827 | 196 | لیکن |
14014 | 194 | جدوں |
16361 | 185 | دی |
12761 | 150 | توں |
27767 | 150 | نوں |
14352 | 140 | جس |
41 | 128 | یہ |
9711 | 116 | ايسے |
14846 | 115 | جو |
15100 | 108 | جے |
26444 | 106 | مگر |
9829 | 103 | اے |
23057 | 103 | کہ |
7296 | 99 | ایہناں |
22449 | 94 | فیر |
The next four subsections show the most frequent sentence beginnings consisting of N words, for N = 1, 2, 3, 4. This subsection starts with N = 1.
The most frequent word-N-grams at the beginning of sentences give some insight into sentence composition.
For N=1 in particular, even a small corpus is enough to identify the most frequent sentence beginnings.
select substring_index(sentence, ' ', 1) as beg, count(*) as cnt from sentences group by substring_index(sentence, ' ', 1) order by cnt desc limit 50;
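The same count can be sketched outside the database. The Python snippet below is a minimal illustration, assuming the sentences are available as plain strings with tokens separated by single spaces. The function name `sentence_beginnings` and the toy sentences are hypothetical and not part of the corpus tooling. Taking everything before the N-th space mirrors what `substring_index(sentence, ' ', N)` does in the query, so the sketch also covers N = 2, 3, 4.

```python
from collections import Counter


def sentence_beginnings(sentences, n=1, limit=50):
    """Return the `limit` most frequent sentence beginnings of n tokens."""
    counts = Counter()
    for sentence in sentences:
        # Keep everything before the n-th space, like
        # substring_index(sentence, ' ', n); a sentence with fewer
        # than n spaces is counted under its full text.
        beginning = " ".join(sentence.split(" ")[:n])
        counts[beginning] += 1
    # most_common sorts by descending count, like ORDER BY cnt DESC.
    return counts.most_common(limit)


# Hypothetical toy data, not taken from the corpus.
sample = [
    "the cat sleeps",
    "the dog barks",
    "a bird sings",
    "the cat purrs",
]

top_1 = sentence_beginnings(sample, n=1)
top_2 = sentence_beginnings(sample, n=2)
print(top_1)
print(top_2)
```

Beginnings with equal counts may come back in a different order than from the SQL query, since the query does not specify a tie-breaking order.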